A mean field method with correlations determined by linear response 
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We introduce an approximation based on the reconciliation of maximum entropy and linear re- 
sponse for correlations in the cluster variation method (CVM). Within a general formalism that 
includes previous mean- field methods, we derive formulas improving upon, e.g., the Bethe approx- 
imation and the Sessak-Monasson result at high temperature. Applying the method to direct and 
inverse Ising problems, we find improvements over standard implementations. 



INTRODUCTION: CLUSTER VARIATIONAL 
AND REGION BASED METHODS 



The cluster variational method (CVM) is a unifying 
framework for many approximation methods on graph- 
ical models, with variational parameters in correspon- 
dence with the marginal probability distributions one is 
interested in [H-Q. Fast and provably convergent meth- 
ods are known for the minimization of the CVM free en- 
ergy 0| , and systematic expansion methods about min- 
ima have been shown [H, 0] ■ Generalizations and convex 
approximations to CVM (Bethe approximation in par- 
ticular) have allowed for the development of fast and se- 
cure inference methods 0, ■ The use of linear response 
(LR) to improve further upon the correlation estimates 
is a recurring theme in science Another impor- 

tant application of LR has been in inverse problems, the 
methods have been applied, e.g., to infer protein folding 
structures and neural networks [13| |. 



In this Letter we develop an extension of the stan- 
dard method for fixing parameters in CVM, that allows 
the marginal probabilities to be made consistent with 
the LR estimates. Beginning with the Bethe free en- 
ergy, this method recovers the Sessak-Monasson expres- 
sion for correlation estimation [14} , and with an alter- 
native CVM approximation we improve upon the for- 
mula. Significantly the derivation of the expressions are 
from variational frameworks. We apply the method to 
homogeneous lattice models, and demonstrate improve- 
ments with respect to the standard implementation. We 
also apply the method to the inverse problem of esti- 
mating couplings, demonstrating results superior to the 
best mean-field methods for a range of temperatures. For 
brevity we will focus only on binary variables (spins), 
pairwise interactions and three standard region selection 
rules; but the principle we outline is flexible with respect 
to these criteria. The framework offers many avenues for 
improvement: e.g., many of the extensions outlined in 
introductory comments can be directly incorporated. 

A paradigmatic problem in physics is the determina- 
tion of the thermodynamics and marginal probabilities 
of a system with N spins {cr, = ±1} with a Hamiltonian 
determined by external fields H, and symmetric pair cou- 



plings (Ja = and Jij = Jji) 

%{a) = -} j Hi<7i - 2J Jij&i<Tj 



i<3 



The free energy 



PF(H,J) = -logTr exp(-/3H(<r)) , 



(1) 



(2) 



is in most cases computational intractable for moderate 
system sizes and the analytic solution is unknown in the 
large N limit, Tr denotes a summation over all variables 
in the expression. The cluster variational method (CVM) 
offers insight into approximations 0, 0] : the free energy 
is the sum of an energetic and entropic part f3F — fiE — S 



E(b, J,H) = - 2J Tr bijJijdij - 2J Tr biHi&i , 

i<j i 

S Q>) = -J2 CRTrbRl °z bR > ( 3 ) 



where R are subsets of variables, cr are integer count- 
ing/Mdbius numbers, and 6r(ctr) are beliefs over the 
variables in R (arguments omitted for brevity). A suf- 
ficient parameterization of the beliefs in terms of con- 
nected correlation parameters C is 



1 



1 



pGP(R) sep 



(4) 



where P{R) are non-empty partitions over R, and s are 
the elements (subsets) in each partition. The beliefs are 
by this choice normalized, and share parameters so as to 
be consistent on all marginals. We can interpret b^ as 
locally consistent probabilities provided < &r(<tr) < 
1. If the largest region R is the entire graph, Eq. @ 
becomes equivalent to Eq. j3j if: (A) we substitute the 
true connected correlations for the parameters; (B) we 
minimize the free energy with respect to the parameters, 
within the feasible parameter space. 

CVM is impractical with large regions, but it is well 
known that efficient approximations can result by select- 
ing only a subset of small regions, and determining the 
beliefs by maximum entropy. For graphs of special topol- 
ogy, or where out-of-region correlations are weak, these 
approximations are good or even exact. 
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A hierarchy of mean-field approximations can be re- 
covered: The naive mean field (NMF) approximation is 
achieved by selecting single- variable regions, 6y = 6^6, , 



Snmf (M) = ~J2Ys bi log bl 



(5) 



The only parameters are {Ci}, denoted by vector M. The 
Bethe approximation includes NMF regions, and adds 
one pair region for each non-zero coupling (Jy =^ 0), 
taking Sb — Snmf + ASb, the correction is 



AS B (M, C 2 ) = - ]T Tr hj log ( ^ 



(6) 



where C 2 denotes correlations over pair variables. The 
TAP free energy for independent coupling distributions 
can be reproduced by an expansion of the Bethe free 
energy in C 2 to quadratic order. One possibility beyond 
Bethe (that we call the Plaquette free energy) includes all 
plaquette regions. A plaquette is a closed loop of coupled 
variables, without chords; an ordered set of L R variables 
. . . ,i P ) such that J 4x ,i x+1 ^ (allowing i P+x = ii). 
The entropy correction from Bethe is 



AS P (M,C 2 +) 



p 



Tr bp log 



b P K 



1I.!.V'<. 



(7) 

where ap is the set {c^ : i £ P}, and C n+ is the vector 
of all correlations in at least n indices. 

If the approximate CVM free energy is minimized in 
the standard approach, parameters are chosen to meet 
the saddle-point equations 



dF 



0, 



(8) 



with a positive definite Hessian. From linear response 
(LR) about the minima the connected correlation on sub- 
set s may be approximated as 



d^F(M,C 2+ ) 



(9) 



Except for cases where the free energy is exact the pa- 
rameters C2+, and LR estimates X2+, differ. 

By a simple argument 15[ it is expected that param- 
eters fixed by the saddle-point criteria will be poorer es- 
timates to the true correlations than those estimated by 
LR about the saddle-point. Indeed CVM is fundamen- 
tally a variational method, inducing an approximation 
to the probability measure P(a) = 

Pexact 

Since 0(e d ) errors arise in quantity determined by d th 
derivatives of the free energy, the correlation parameters 
determined by first derivative conditions are of lower fi- 
delity than the the LR estimates obtained by higher or- 
der derivatives. However, by interpreting x as t ne true 



correlations we implicitly accept the inaccuracy of b R as 
correct marginal probabilities at the saddle-point. 
We require consistency between beliefs and responses 



C2+ — X2+ • 



(10) 



To implement this constraint we can invoke a modified 
free energy, with slack parameters A 

F(H, J, M, C 2 +, A) = F(H, J, M, C 2+ ) + A • C 2+ . (11) 

Choosing A to satisfy ([8]), leaves C 2 + unconstrained by 
the saddle-point equations, such that we can fix it self- 
consistently from (j9]) and (fT0|). We will treat A as invari- 
ant parameters with respect to H. A is large for unsuit- 
able region selection, and A = where CVM is exact. 



A GENERAL FRAMEWORK FOR LINEAR 
RESPONSE 

In our method a minimum of the free energy is first 
determined. The saddle-point equation for Mi is 

= -Hi - J ij M o + atanh(Mi) + A t (M, C 2+ ) , 



Ai(M,C 2+ )=Tr c R ^b RV log 



:i2) 



where bp^ is the belief over the region excluding i. It 
includes the NMF result and the correction Ai(M, C 2 +). 
Each element in C 2 +, labeled by its index set s, gives a 
saddle-point equation 

J s - X s =TrJ^ cpY[^-bp\ s (a R \ s ) log (b R (a R )) , 

R:sGR iGs 

(13) 

J s = for subsets \s\ > 2. Eqs. (|12I13[) fix a subset of 
parameters C, A: in our method we fix A and M, whereas 
the standard approach fixes M and C 2 + (and A = 0) . 

Going forward we assume this is a well-defined min- 
imum in the variational parameters (C): differentiable 
and not on the boundary of the feasible parameter space. 
In response to a variation in the field H — >• H + eSH the 
variational parameters are perturbed C — > C + e5C, a 
quadratic order free energy describes the fluctuation 

F(H + eSH, J, C, A) + e 2 6CQSC/2 - e 2 6HLSC + 0(e 3 ) . 

The responses 8C are determined by QSC = LSH. Elim- 
inating 5G 2 +, and identifying SMi = J2z Xiz&H z as linear 
responses, we can write a system of N 2 equations 



bc 



- ( 8J ii + $ ii (M,C 2+ ) 



(14) 



If the arguments C3+ in <I> are known, then Eqs. (110112114(1 

form a closed set for the determination of C 2 , M and A. 
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For the NMF and Bethe approximation $ is indepen- 
dent of C3+, but for the Plaquette approximation these 
should be determined. In practice several well reasoned 
methods to fix are available, we restrict attention 
henceforth to scenarios where C3+ are known exactly 
(e.g. by symmetry). 



COMPARISON OF METHODS 



Equation (|14l) applies regardless of how parameters are 
fixed, it is inclusive of standard LR methods [HI fli| . 
Standard (A = 0) linear response results are reproduced 
by solving (|12ll3p without requiring (fT0|) . 

In our new approach, beginning from the Bethe ap- 
proximation, we determine the off-diagonal matrix com- 



J 



a*5 = E c 



ijk 



(Cik — CjkCij)(Cjk — CikCij) 



(1 



cm 



ponents in Eq. (fT4]) to be, for edge regions ij included in 
the approximation, 



J 



ip, 



(l-Mf)(l-Mf)-Cf 



where J IP is the independent pair approximation 



J IP (b Z3 ) = Tr 



log b t 



(15) 



(16) 



Using $^ in Eq. (JT3J| the Sessak-Monasson [14| result 
for small correlation parameters is recovered. For the 
plaquette method we have corrections $ p = $ B + A$ p . 
In the simplest case with triangular plaquettes, and spin- 
symmetric probabilities {Mi — 0,Cijk = 0), 



C?k + 2Cj k C ik Cij - C.) 



\ log 



(1 - c %j f{(\ + c l3 y - (c jk + c ik yy 



(i 



Qi) 2 



(Q fc -C jfc ) 2 ). 



where ctj k — 1(0) for included (excluded) plaquettes. For 
a fully connected model we can consider the leading order 
errors in the high temperature regime by an expansion in 
p. For the symmetric case, an approximation inclusive 
of all edge regions yields a correction in $y 

•i>" •K.r" E 2^44 +o(/? 6 ), a?) 

to be compared to the 0(/3 4 ) error in standard Bethe. 
For the Plaquette approximation, including all triplet re- 
gions, the error is further improved to 



to (jiop . (fT2j) and (fT3|) . A possible iterative scheme is 



rf.exa.ct 

ij ij 



P 7 2J/-; [JijJ k i(Ji k Jji + JuJjk) 2 + 
k<l&i,j) 

2Ji k JjkJilJjl{JijJkl + JikJjl + JilJjk)] + 0((3 S ) , (18) 

to be compared to 0((3 5 ) for the standard implementa- 
tion of linear response on plaquette regions. The error in 
(fT8|) is evaluated with the exact C, which is pertinent to 
an inverse problem application. If instead C2+ is deter- 
mined self-consistently by (ITU1) and (THJ the coefficient of 
f3 7 contains additional terms, but it remains of the same 
order in f3. Instead in (flTT) the coefficient of 0(f3 5 ) is 
unchanged for C determined exactly or by saddle-point. 



Direct problem 

The direct problem of determining magnetizations and 
correlations given H,J requires the simultaneous solution 



C\ +1 



M 



t+i 



[-(3J + ^(M t ,ClC 3+ )] 1 ; 

tanh [P(H + JM l ) - A(M 4 , C\, C 3 +)] 



(19) 



At sufficiently high temperature the scheme is conver- 
gent, and the solution stable. However, at lower temper- 
atures the process may be unconvergent, and so strong 
damping and/or special ordering is required. 

We find our method to be promising for models with 
short significant loops. The homogeneous triangular lat- 
tice model (HTL) with H = 0, Jij = 1 for nearest 
neighbors and otherwise, is a well understood canon- 
ical model that has a ferromagnetic transition point at 
(3 C = 0.275 [l7j], and is fully frustrated for /3 < 0, with no 
long range order [l8|]. For finite lattice implementations 
we choose periodic boundary conditions (periodicity L). 
Figure Q] shows the corresponding region based approxi- 
mations in the direct problem. 




NMF 



Bethe 



Plaquette nn 



FIG. 1: Regions and counting numbers for a triangular lat- 
tice, {a, c} are nearest neighbors (nn), {a, b} are next-nearest 
neighbors (nnn). We abbreviate x an d C indices accordingly. 

Fig.[5]shows nearest neighbor correlation estimates ob- 
tained in the thermodynamic limit by Fourier techniques. 
We show the exact result by the black line, standard LR 
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FIG. 2: (color online) Nearest neighbors correlation estimates 
for the HTL L — > oo. Magnetized branches are shown only for 
Bethe and Plaquette (A = 0, ft > 0) correlation parameters. 





FIG. 3: Next nearest neighbor correlations for the HTL L = 5, 
unmagnetized branches. Curves as figure [2] 



methods in red (label x), standard methods minimizing 
F in the variational parameters in green (label C) and 
our new method in blue. All methods perform well at 
high temperature (small |/3|), and magnetized solutions 
perform well for /3 3> /3 C . Standard methods undergo con- 
tinuous transitions for ft ~ j3 c , and in some cases even for 
P < 0. Linear response estimates diverge at these points. 
The Standard (A = 0) Plaquette method performs well 
in the estimate of C nn , for the unmagnetized solutions, 
but only in the range @ £ (—1.01,0.255), while our new 
Plaquette method performs well in the entire frustrated 
region P < (see the inset of Fig. [2]). The unmagnetized 
solution does not exhibit continuous phase transitions for 
the new methods. At low temperature convergence prob- 
lems hinder the construction of solutions, the unmagne- 
tized Plaquette solution is constructed only for P < 0.3, 
certainly this solution disappears at P = 0.35,where the 
Hessian becomes singular for any C nn < 1. 

Although standard methods (A = 0) may well find cor- 
relation parameters C inside a plaquette, linear response 
is required to determine correlations outside plaquette re- 
gions. For P £ (— oo,p c ] the new method improves upon 
standard implementations for significant terms in x an d 
X 1 - Figure [3] shows the next nearest correlations calcu- 
lated on a finite model [L = 5 values are infact close to 
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FIG. 4: THL: error in inference of J from exact statistics. 
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FIG. 5: Error in inferring couplings J for a diluted 2D square 
ferromagnet, from statistics of 10 independent samples. [KR] 
employs the Kappen- Rodriguez normalization [ljj. 



L — > oo for p < p c ), the new method estimates are su- 
perior to their counterparts at all p in the unmagnetized 
branches. Note that whereas the standard implementa- 
tion has a Hessian independent of L, the new method is 
sensitive to the boundary conditions: the unmagnetized 
Bethe and Plaquette approximations are locally unsta- 
ble for P < —0.68 and P < —1.17 respectively when 
L = 5, the Plaquette approximations remains more sta- 
ble against symmetry breaking than the standard imple- 
mentation. 



Inverse problem 

A simpler application of our method is for the in- 
verse problem: given sample statistics, determine J and 
H [3 EH ■ With ignorance of the distribution of cou- 
plings, we should select all edge regions for Bethe, and 
all triangular regions for Plaquette, approximations. In 
the new method we take x — C equal to the correlation 
statistics, and then impose (fl~2"|) and off-diagonal elements 
of (p~4)) . In standard mean field applications x is deter- 
mined from the statistics, but C2+ obey the saddle point 
equations (TT3"|) : the equations are solvable for Bethe and 
lower order methods [161 ] . 

Figure U demonstrates the results for estimation of ma- 
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trix J in the HTL L = 5 based on perfect data. The im- 
proved scaling at small |/3| is as anticipated in Eqs. (1T71) . 
(|18[) . However, even at low temperature reconstruction 
is significantly improved by the new methods. Although 
Qnn determines the error, note that the approximation is 
different to that used in the direct problem, the 2D trian- 
gular structure is discovered, unlike in the direct problem 
where it is assumed in the region selection. 

Figure [5] demonstrates results for an instance of a 7 by 
7 diluted square lattice Ising model in zero field. Each 
coupling is assigned according to the probability distri- 
bution P(J) = 0.7<5j,i + 0.3(5,7,0- We took M = 0, and 
generated the pair-correlation matrix from independent 
Monte Carlo measures. Sampling error limits all algo- 
rithm performance for small /3. When j3 is large enough 
the error of the method exceeds sampling one. A /3 inter- 
val in which the new methods perform well is apparent. 
The triangular Plaquette approximation improves over 
Bethe, despite the absense of triangles in the model (the 
shortest loop is of length 4). For larger j3 the model 
undergoes a rapid growth in correlation length, far be- 
yond the edge/triangular regions selected, all mean- field 
methods are prone to significant errors. 

This work is supported by the Italian Research Minis- 
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